Unified physiological model of audible-visible speech production

Authors

  • Eric Vatikiotis-Bateson
  • Hani Yehia
Abstract

In this paper, vocal tract and orofacial motions are measured during speech production in order to demonstrate that vocal tract motion can be used to estimate its orofacial counterpart. The inverse estimation, i.e. recovering vocal tract behavior from orofacial motion, is also possible, but to a lesser extent. The numerical results showed that vocal tract motion accounted for 96% of the total variance observed in the joint system, whereas orofacial motion accounted for 77%. This analysis is part of a wider study in which a dynamical model is being developed to express vocal tract and orofacial motions as a function of muscle activity. This model, currently implemented through multilinear second-order autoregressive techniques, is described briefly. Finally, the strong direct influence that vocal tract and facial motions have on the energy of the speech acoustics is exemplified.
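
A minimal sketch in Python of the kind of multilinear estimation the abstract describes: a least-squares map from vocal tract channels to orofacial markers and back, reporting the fraction of variance each direction recovers. The array shapes, the synthetic stand-in data, and the linear_estimate helper are assumptions for illustration, not the authors' implementation; the second-order autoregressive terms driven by muscle activity used in the full dynamical model are not shown here.

import numpy as np

rng = np.random.default_rng(0)

T = 500        # number of time frames (assumed)
n_vt = 8       # vocal tract channels, e.g. articulometer coordinates (assumed)
n_face = 12    # orofacial marker coordinates (assumed)

# Stand-in data: facial motion generated as a noisy linear function of
# vocal tract motion, just to make the example runnable.
vt = rng.standard_normal((T, n_vt))
face = vt @ rng.standard_normal((n_vt, n_face)) + 0.3 * rng.standard_normal((T, n_face))

def linear_estimate(X, Y):
    """Least-squares linear map X -> Y and the fraction of variance it recovers."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ W
    return W, 1.0 - resid.var() / Y.var()

# Forward direction: vocal tract -> face (the paper reports ~96% recovered).
_, vt_to_face = linear_estimate(vt, face)
# Inverse direction: face -> vocal tract (the paper reports ~77%, i.e. weaker).
_, face_to_vt = linear_estimate(face, vt)

print(f"face variance recovered from vocal tract motion: {vt_to_face:.2f}")
print(f"vocal tract variance recovered from facial motion: {face_to_vt:.2f}")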

Similar resources

Low-Audible Speech Detection using Perceptual and Entropy Features

Low-audible speech detection is important since low-audible speech conveys a significant amount of speaker information and meaning. The performance of Automatic Speaker Recognition (ASR) and speaker identification systems drops considerably when low-audible speech is provided as input. In order to improve the performance of such systems, low-audible speech detection is essential. The production, acoustic a...

Audiovisual Speech Recognition with Articulator Positions as Hidden Variables

Speech recognition, by both humans and machines, benefits from visual observation of the face, especially at low signal-to-noise ratios (SNRs). It has often been noticed, however, that the audible and visible correlates of a phoneme may be asynchronous; perhaps for this reason, automatic speech recognition structures that allow asynchrony between the audible phoneme and the visible viseme outpe...

Influence of Phone-Viseme Temporal Correlations on Audiovisual STT and TTS Performance

In this paper, we present a study of temporal correlations of audiovisual units in continuous Russian speech. The corpus-based study identifies natural time asynchronies between the flows of audible and visible speech modalities, partially caused by the inertance of the articulation organs. Original methods for speech asynchrony modeling have been proposed and studied using bimodal ASR and TTS system...

Audible Aspects of Speech Preparation

Noises made before the acoustic onset of speech are typically ignored, yet may reveal aspects of speech production planning and be relevant to discourse turn-taking. We quantify the nature and timing of such noises, using an experimental method designed to elicit naturalistic yet controlled speech initiation data. Speakers listened to speech input, then spoke when prompt material became visible...

Perception of Synthesized Audible and Visible Speech

The research reported in this paper uses novel stimuli to study how speech perception is influenced by information presented to ear and eye. Auditory and visual sources of information (syllables) were synthesized and presented in isolation or in factorial combination. A five-step continuum between the syllables /ba/ and /da/ was synthesized along both auditory and visual dimensions, by varying ...

Journal title:

Volume   Issue

Pages  -

Publication date: 1997